Demo: Geo Mapping with Plotly - SWISS Choropleth Maps

The process of producing visual representations of geographic data, frequently on a map or in another geographical context, is referred to as geo-mapping. Depending on their geographic coordinates, points, lines, or areas can be displayed on a map.

Choropleth Map

Choropleth Maps display divided geographical areas or regions that are coloured, shaded or patterned in relation to a data variable. This provides a way to visualise values over a geographical area, which can show variation or patterns across the displayed location.

Each region is coloured according to its value of the data variable, using a colour progression. Typically, this can be a blending from one colour to another, a single hue progression, transparent to opaque, light to dark or an entire colour spectrum.

One downside to the use of colour is that you can't accurately read or compare values from the map.
Another issue is that larger regions appear more emphasised than smaller ones, so the viewer's perception of the shaded values is affected.
A common error when producing Choropleth Maps is to encode raw data values (such as population) rather than using normalized values (calculating population per square kilometre for example) to produce a density map.
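
To make the normalization point concrete, here is a minimal sketch with pandas. The three regions and their numbers are illustrative placeholders (not real data), and the column names `population`, `area_km2` and `population_per_km2` are assumptions chosen for this example.

```python
import pandas as pd

# Illustrative placeholder values for three hypothetical regions (not real data)
regions = pd.DataFrame(
    {
        "region": ["A", "B", "C"],
        "population": [500_000, 500_000, 50_000],
        "area_km2": [5_000.0, 500.0, 25.0],
    }
)

# Normalize the raw count by area to obtain a density
regions["population_per_km2"] = regions["population"] / regions["area_km2"]

# Region C has the smallest population but the highest density
print(regions.sort_values("population_per_km2", ascending=False))
```

Colouring by `population` would make regions A and B look identical, while the density column separates them by a factor of ten.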


Plotly

Plotly is an open source library for Python. It integrates well with Jupyter Notebook and Dash to create interactive content for websites. It can be used to create ad hoc charts and professional content.

Handling spatial data for Geo Mapping needs some additional datasets and basic setup before you can use it for visual data representation and analysis.

For example, choropleth charts are built from colored shapes, and those shapes vary widely. The Plotly documentation shows that default shapes are available in Plotly for USA states and for countries as defined in the Natural Earth dataset.

To visualize data for Swiss regions such as cantons, districts or municipalities, we need other data sources that provide the corresponding shapes.




Introduction

What to consider when creating choropleth maps

Maps are not objective, but a version of reality. When creating them, lots of choices are made: What to map, how to map and whether or not to use a map in the first place?

How to make your choropleth maps better

Consider a sequential color scheme if you want to draw attention to the high values, e.g. for unemployment rates. Consider a diverging scheme if you want to draw attention to both extremes of the scale, e.g. to show the difference in votes between two competing parties. With any color scheme, use colorblind-friendly colors.
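
A diverging scheme is easiest to read when its neutral colour sits on a meaningful midpoint. The sketch below computes a colour range that is symmetric around such a midpoint. The helper name `symmetric_range` is hypothetical, and the sample values are the gender ratios of the first five cantons computed later in this demo. The resulting pair could be passed to the `range_color` argument of a Plotly Express map; Plotly Express also offers `color_continuous_midpoint` for a similar purpose.

```python
def symmetric_range(values, midpoint):
    """Return (low, high) bounds centred on midpoint that cover all values."""
    max_dev = max(abs(v - midpoint) for v in values)
    return (midpoint - max_dev, midpoint + max_dev)


# Ratio men/women for five cantons; 1.0 means parity
ratios = [1.012091, 1.049950, 1.016965, 0.966656, 0.963094]

low, high = symmetric_range(ratios, midpoint=1.0)
print(low, high)
```

With these bounds the neutral colour of the palette corresponds to parity, so both colour directions carry a clear meaning.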


What is GeoPandas?



The goal of GeoPandas is to make working with geospatial data in Python easier. It combines the capabilities of pandas and shapely, providing geospatial operations in pandas and a high-level interface to multiple geometries through shapely. GeoPandas enables you to easily do operations in Python that would otherwise require a spatial database such as PostGIS.

GeoPandas, as the name suggests, extends the popular data science library pandas by adding support for geospatial data.

The core data structure in GeoPandas is the geopandas.GeoDataFrame, a subclass of pandas.DataFrame, that can store geometry columns and perform spatial operations.

The geopandas.GeoSeries, a subclass of pandas.Series, handles the geometries.

Therefore, your GeoDataFrame is a combination of pandas.Series, with traditional data (numerical, boolean, text etc.), and geopandas.GeoSeries, with geometries (points, polygons etc.).

You can have as many columns with geometries as you wish; there’s no limit typical for desktop GIS software.
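
The relationship between these classes can be checked on a tiny example. This is a minimal sketch: the two rectangles, the names `West` and `East`, and the `value` column are made up for illustration and do not correspond to real regions.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Two made-up rectangular regions in longitude/latitude, purely illustrative
toy_gdf = gpd.GeoDataFrame(
    {"name": ["West", "East"], "value": [1.5, 2.5]},
    geometry=[box(7.0, 46.0, 8.0, 47.0), box(8.0, 46.0, 9.0, 47.0)],
    crs="EPSG:4326",
)

print(isinstance(toy_gdf, pd.DataFrame))    # a GeoDataFrame is also a DataFrame
print(type(toy_gdf["value"]).__name__)      # ordinary column: pandas Series
print(type(toy_gdf.geometry).__name__)      # geometry column: GeoSeries
print(toy_gdf.geometry.geom_type.tolist())  # geometry type of each row
```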



Color Palettes for Choropleth Maps

# Load PLOTLY library
import plotly.express as px
# List of diverging color palettes for choropleth maps
fig = px.colors.diverging.swatches_continuous()
fig.show()

Example: Swiss Choropleth Plot Map using Swiss Cantons GeoFrame and BFS DataFrame

1) Load the Swiss Cantons raw GeoData and Pre-Process the corresponding geodf GeoFrame

# Load GEOPANDAS library
import geopandas as gpd
# read Swiss Cantons Geometry Data (as GeoFrame)
geofilePATH = 'https://raw.githubusercontent.com/sawubona-repo/BINA-FS24-WORK/master/zDiversExamples/Notebook-GeoMapping/DATA/'
geofileNAME = 'ch-cantons-new.geojson'

# Read GeoJSON geometry data into geopandas GeoDataFrame
raw_geodf = gpd.read_file(geofilePATH+geofileNAME)

raw_geodf.info()
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 51 entries, 0 to 50
Data columns (total 4 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   KantonNr    51 non-null     int64   
 1   KantonTeil  51 non-null     object  
 2   KantonName  51 non-null     object  
 3   geometry    51 non-null     geometry
dtypes: geometry(1), int64(1), object(2)
memory usage: 1.7+ KB
raw_geodf.head()
   KantonNr KantonTeil  KantonName                                           geometry
0        18          0  Graubünden  POLYGON ((8.87705 46.81291, 8.87783 46.81308, ...
1         2          1        Bern  POLYGON ((8.04694 46.78711, 8.05029 46.78834, ...
2        23          0      Valais  POLYGON ((8.38472 46.45216, 8.38364 46.45151, ...
3        22          1        Vaud  POLYGON ((7.07124 46.20101, 7.06621 46.19987, ...
4        21          0      Ticino  POLYGON ((8.38472 46.45216, 8.38474 46.45231, ...
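
The geometry file contains 51 rows, which is more than the number of cantons, apparently because a canton can be split into several polygons that are numbered in the `KantonTeil` column. The sketch below counts the parts per canton on a small hand-built frame; its rows are copied from the outputs shown in this demo, and the frame name `parts` is an assumption for this example.

```python
import pandas as pd

# A few rows as they appear in the geometry data (geometry column omitted)
parts = pd.DataFrame(
    {
        "KantonNr": [18, 2, 2, 2, 2, 23],
        "KantonTeil": ["0", "1", "2", "3", "4", "0"],
        "KantonName": ["Graubünden", "Bern", "Bern", "Bern", "Bern", "Valais"],
    }
)

# Number of polygons (rows) per canton
parts_per_canton = parts.groupby("KantonName")["KantonTeil"].count()
print(parts_per_canton)

# Number of distinct cantons versus number of rows
print(parts["KantonNr"].nunique(), "cantons in", len(parts), "rows")
```

The same `groupby` call on `raw_geodf` would show which cantons consist of more than one polygon.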

2) Load the Swiss Federal Statistical Office (BFS) raw Data and Pre-Process the corresponding bfsdf DataFrame

See BFS for the original data source.

import pandas as pd
# read BFS Data "Ständige Wohnbevölkerung, Stand 2022" (permanent resident population, as of 2022)
bfsfilePATH = 'https://raw.githubusercontent.com/sawubona-repo/BINA-FS24-WORK/main/zDiversExamples/DATA/BFS/'
bfsfileNAME = 'BFS_CH-Kantone-Staendige-Wohnbevoelkerung-2022.csv'
raw_bfs_file = bfsfilePATH + bfsfileNAME

# Read CSV file into DataFrame
raw_bfsdf = pd.read_csv(raw_bfs_file, sep=';')

raw_bfsdf.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 26 entries, 0 to 25
Data columns (total 21 columns):
 #   Column                                           Non-Null Count  Dtype 
---  ------                                           --------------  ----- 
 0   KantonNr                                         26 non-null     int64 
 1   KantonKürzel                                     26 non-null     object
 2   Kanton                                           26 non-null     object
 3   Total                                            26 non-null     int64 
 4   Alter-0–19                                       26 non-null     int64 
 5   Alter-20–64                                      26 non-null     int64 
 6   Alter-65-und-mehr                                26 non-null     int64 
 7   Mann                                             26 non-null     int64 
 8   Frau                                             26 non-null     int64 
 9   Schweizer                                        26 non-null     int64 
 10  Ausländer                                        26 non-null     int64 
 11  Ledig                                            26 non-null     int64 
 12  Verheiratet                                      26 non-null     int64 
 13  Verwitwet                                        26 non-null     int64 
 14  Geschieden                                       26 non-null     int64 
 15  Unverheiratet                                    26 non-null     int64 
 16  EingetragenePartnerschaft                        26 non-null     int64 
 17  AufgelöstePartnerschaft                          26 non-null     int64 
 18  StädtischerKernraum                              26 non-null     int64 
 19  EinflussgebietstädtischerKerne                   26 non-null     int64 
 20  GebieteausserhalbEinflussgebietstädtischerKerne  26 non-null     int64 
dtypes: int64(19), object(2)
memory usage: 4.4+ KB
raw_bfsdf.head()
KantonNr KantonKürzel Kanton Total Alter-0–19 Alter-20–64 Alter-65-und-mehr Mann Frau Schweizer ... Ledig Verheiratet Verwitwet Geschieden Unverheiratet EingetragenePartnerschaft AufgelöstePartnerschaft StädtischerKernraum EinflussgebietstädtischerKerne GebieteausserhalbEinflussgebietstädtischerKerne
0 19 AG Aargau 711232 144909 433224 133099 357753 353479 524909 ... 308288 310713 31059 60103 32 803 212 388542 222279 100411
1 16 AI Appenzell I. Rh. 16416 3392 9674 3350 8408 8008 14510 ... 7420 7045 855 1083 1 9 3 0 0 16416
2 15 AR Appenzell A. Rh. 55759 11365 32747 11647 28114 27645 46395 ... 24016 23935 2744 4983 1 66 14 15744 26922 13093
3 2 BE Bern 1051437 200509 620777 230151 516805 534632 872661 ... 469022 431354 54214 94967 57 1389 416 551861 234361 265215
4 13 BL Basel-Landschaft 294417 56709 170422 67286 144441 149976 223876 ... 121799 130051 15814 26126 13 504 104 199822 87691 6904

5 rows × 21 columns
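
The `sep=';'` argument matters because the BFS file is semicolon-separated. The following minimal sketch shows the effect on an in-memory text; the two rows and three columns are copied from the output above, and the variable names are assumptions for this example.

```python
import io

import pandas as pd

# Two rows and three columns from the BFS data, written as semicolon-separated text
csv_text = "KantonNr;Kanton;Total\n19;Aargau;711232\n16;Appenzell I. Rh.;16416\n"

# Correct separator: three typed columns
sample = pd.read_csv(io.StringIO(csv_text), sep=";")
print(sample.dtypes)

# Default comma separator: every line ends up in a single text column
wrong = pd.read_csv(io.StringIO(csv_text))
print(wrong.shape)
```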


3) Feature Engineering and Reducing Data for further analysis. Define a selection bfsdf_sel of the entire BFS DataFrame

# Work on a copy so that the raw DataFrame stays unchanged
bfsdf = raw_bfsdf.copy()

# Rename and homogenize feature/column names in the DataFrame
bfsdf = bfsdf.rename(columns={'Total': 'Einwohner_2022'})

bfsdf.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 26 entries, 0 to 25
Data columns (total 21 columns):
 #   Column                                           Non-Null Count  Dtype 
---  ------                                           --------------  ----- 
 0   KantonNr                                         26 non-null     int64 
 1   KantonKürzel                                     26 non-null     object
 2   Kanton                                           26 non-null     object
 3   Einwohner_2022                                   26 non-null     int64 
 4   Alter-0–19                                       26 non-null     int64 
 5   Alter-20–64                                      26 non-null     int64 
 6   Alter-65-und-mehr                                26 non-null     int64 
 7   Mann                                             26 non-null     int64 
 8   Frau                                             26 non-null     int64 
 9   Schweizer                                        26 non-null     int64 
 10  Ausländer                                        26 non-null     int64 
 11  Ledig                                            26 non-null     int64 
 12  Verheiratet                                      26 non-null     int64 
 13  Verwitwet                                        26 non-null     int64 
 14  Geschieden                                       26 non-null     int64 
 15  Unverheiratet                                    26 non-null     int64 
 16  EingetragenePartnerschaft                        26 non-null     int64 
 17  AufgelöstePartnerschaft                          26 non-null     int64 
 18  StädtischerKernraum                              26 non-null     int64 
 19  EinflussgebietstädtischerKerne                   26 non-null     int64 
 20  GebieteausserhalbEinflussgebietstädtischerKerne  26 non-null     int64 
dtypes: int64(19), object(2)
memory usage: 4.4+ KB
# Define selection for further analysis and visualization
bfsdf_sel = bfsdf[['KantonNr', 'Kanton', 'Einwohner_2022', 'Alter-0–19', 'Alter-65-und-mehr', 'Mann', 'Frau', 'Geschieden', 'Schweizer', 'StädtischerKernraum']]

bfsdf_sel.head()
   KantonNr            Kanton  Einwohner_2022  Alter-0–19  Alter-65-und-mehr    Mann    Frau  Geschieden  Schweizer  StädtischerKernraum
0        19            Aargau          711232      144909             133099  357753  353479       60103     524909               388542
1        16  Appenzell I. Rh.           16416        3392               3350    8408    8008        1083      14510                    0
2        15  Appenzell A. Rh.           55759       11365              11647   28114   27645        4983      46395                15744
3         2              Bern         1051437      200509             230151  516805  534632       94967     872661               551861
4        13  Basel-Landschaft          294417       56709              67286  144441  149976       26126     223876               199822
# Add additional features, proportional values etc.

# Work on an explicit copy of the selection to avoid pandas' SettingWithCopyWarning
bfsdf_sel = bfsdf_sel.copy()

# Calculate the fraction of "Schweizer" in "Einwohner_2022"
bfsdf_sel["zAnteil_Schweizer"] = bfsdf_sel['Schweizer'] / bfsdf_sel['Einwohner_2022']

# Calculate the ratio of "Senior" (Alter-65-und-mehr) versus "Junior" (Alter-0–19)
bfsdf_sel["zRatio_SeniorJunior"] = bfsdf_sel['Alter-65-und-mehr'] / bfsdf_sel['Alter-0–19']

# Calculate the ratio of "Mann" versus "Frau"
bfsdf_sel["zRatio_MannFrau"] = bfsdf_sel['Mann'] / bfsdf_sel['Frau']

# Calculate the fraction of "Geschieden" in "Einwohner_2022"
bfsdf_sel["zAnteil_Geschieden"] = bfsdf_sel['Geschieden'] / bfsdf_sel['Einwohner_2022']

# Calculate the fraction of "StädtischerKernraum" in "Einwohner_2022"
bfsdf_sel["zRatio_Städtisch"] = bfsdf_sel['StädtischerKernraum'] / bfsdf_sel['Einwohner_2022']

bfsdf_sel.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 26 entries, 0 to 25
Data columns (total 15 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   KantonNr             26 non-null     int64  
 1   Kanton               26 non-null     object 
 2   Einwohner_2022       26 non-null     int64  
 3   Alter-0–19           26 non-null     int64  
 4   Alter-65-und-mehr    26 non-null     int64  
 5   Mann                 26 non-null     int64  
 6   Frau                 26 non-null     int64  
 7   Geschieden           26 non-null     int64  
 8   Schweizer            26 non-null     int64  
 9   StädtischerKernraum  26 non-null     int64  
 10  zAnteil_Schweizer    26 non-null     float64
 11  zRatio_SeniorJunior  26 non-null     float64
 12  zRatio_MannFrau      26 non-null     float64
 13  zAnteil_Geschieden   26 non-null     float64
 14  zRatio_Städtisch     26 non-null     float64
dtypes: float64(5), int64(9), object(1)
memory usage: 3.2+ KB
bfsdf_sel.head()
KantonNr Kanton Einwohner_2022 Alter-0–19 Alter-65-und-mehr Mann Frau Geschieden Schweizer StädtischerKernraum zAnteil_Schweizer zRatio_SeniorJunior zRatio_MannFrau zAnteil_Geschieden zRatio_Städtisch
0 19 Aargau 711232 144909 133099 357753 353479 60103 524909 388542 0.738028 0.918501 1.012091 0.084505 0.546294
1 16 Appenzell I. Rh. 16416 3392 3350 8408 8008 1083 14510 0 0.883894 0.987618 1.049950 0.065972 0.000000
2 15 Appenzell A. Rh. 55759 11365 11647 28114 27645 4983 46395 15744 0.832063 1.024813 1.016965 0.089367 0.282358
3 2 Bern 1051437 200509 230151 516805 534632 94967 872661 551861 0.829970 1.147834 0.966656 0.090321 0.524864
4 13 Basel-Landschaft 294417 56709 67286 144441 149976 26126 223876 199822 0.760404 1.186514 0.963094 0.088738 0.678704
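
Derived features are worth a quick plausibility check before they are mapped. The sketch below recomputes two of the shares for the first two cantons shown above and verifies that they lie between 0 and 1. The frame name `check` is an assumption for this example. Note that this bound applies only to shares of the total population, not to ratios such as `zRatio_SeniorJunior` or `zRatio_MannFrau`, which can exceed 1.

```python
import pandas as pd

# Input values copied from the BFS output for two cantons
check = pd.DataFrame(
    {
        "Kanton": ["Aargau", "Appenzell I. Rh."],
        "Einwohner_2022": [711232, 16416],
        "Schweizer": [524909, 14510],
        "StädtischerKernraum": [388542, 0],
    }
)
check["zAnteil_Schweizer"] = check["Schweizer"] / check["Einwohner_2022"]
check["zRatio_Städtisch"] = check["StädtischerKernraum"] / check["Einwohner_2022"]

share_cols = ["zAnteil_Schweizer", "zRatio_Städtisch"]

# Shares of a total must lie in the closed interval [0, 1]
in_range = check[share_cols].apply(lambda col: col.between(0, 1)).all().all()
print(in_range)
print(check[share_cols].round(6))
```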

4) Combine GeoFrame and DataFrame for Visual Data Analysis and Geomapping

# Merge GeoFrame with DataFrame on corresponding "KantonNr" column values (see "inner join" operation)
joined_geodf = pd.merge(raw_geodf, bfsdf_sel, on=["KantonNr"])

# drop column "Kanton" as these values are identical to "KantonName" (a second primaryKey attribute)
joined_geodf.drop(["Kanton"], axis=1, inplace=True)
joined_geodf.info()
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 51 entries, 0 to 50
Data columns (total 17 columns):
 #   Column               Non-Null Count  Dtype   
---  ------               --------------  -----   
 0   KantonNr             51 non-null     int64   
 1   KantonTeil           51 non-null     object  
 2   KantonName           51 non-null     object  
 3   geometry             51 non-null     geometry
 4   Einwohner_2022       51 non-null     int64   
 5   Alter-0–19           51 non-null     int64   
 6   Alter-65-und-mehr    51 non-null     int64   
 7   Mann                 51 non-null     int64   
 8   Frau                 51 non-null     int64   
 9   Geschieden           51 non-null     int64   
 10  Schweizer            51 non-null     int64   
 11  StädtischerKernraum  51 non-null     int64   
 12  zAnteil_Schweizer    51 non-null     float64 
 13  zRatio_SeniorJunior  51 non-null     float64 
 14  zRatio_MannFrau      51 non-null     float64 
 15  zAnteil_Geschieden   51 non-null     float64 
 16  zRatio_Städtisch     51 non-null     float64 
dtypes: float64(5), geometry(1), int64(9), object(2)
memory usage: 6.9+ KB
joined_geodf.head()
KantonNr KantonTeil KantonName geometry Einwohner_2022 Alter-0–19 Alter-65-und-mehr Mann Frau Geschieden Schweizer StädtischerKernraum zAnteil_Schweizer zRatio_SeniorJunior zRatio_MannFrau zAnteil_Geschieden zRatio_Städtisch
0 18 0 Graubünden POLYGON ((8.87705 46.81291, 8.87783 46.81308, ... 202538 35349 46125 101760 100778 17628 162686 66483 0.803237 1.304846 1.009744 0.087036 0.328250
1 2 1 Bern POLYGON ((8.04694 46.78711, 8.05029 46.78834, ... 1051437 200509 230151 516805 534632 94967 872661 551861 0.829970 1.147834 0.966656 0.090321 0.524864
2 2 2 Bern POLYGON ((7.55835 47.32237, 7.55716 47.32262, ... 1051437 200509 230151 516805 534632 94967 872661 551861 0.829970 1.147834 0.966656 0.090321 0.524864
3 2 3 Bern POLYGON ((7.12304 46.90020, 7.12136 46.90051, ... 1051437 200509 230151 516805 534632 94967 872661 551861 0.829970 1.147834 0.966656 0.090321 0.524864
4 2 4 Bern POLYGON ((7.09086 46.90382, 7.09089 46.90336, ... 1051437 200509 230151 516805 534632 94967 872661 551861 0.829970 1.147834 0.966656 0.090321 0.524864
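
Because several polygons share one `KantonNr`, the merge is a many-to-one join, and the statistics of a canton are repeated for each of its parts. A minimal sketch of how such a join could be checked with pandas follows. The frames `shapes` and `stats` are small stand-ins, and the key `99` is a deliberately invented, unmatched canton number.

```python
import pandas as pd

# Stand-in for the geometry frame: Bern has two parts, 99 is an invented key
shapes = pd.DataFrame({"KantonNr": [2, 2, 18, 99], "KantonTeil": ["1", "2", "0", "0"]})

# Stand-in for the statistics frame: one row per canton
stats = pd.DataFrame({"KantonNr": [2, 18], "Einwohner_2022": [1051437, 202538]})

checked = shapes.merge(
    stats,
    on="KantonNr",
    how="left",              # keep every shape, even without statistics
    validate="many_to_one",  # raise an error if KantonNr is not unique in stats
    indicator=True,          # add a _merge column describing each match
)

# Keys that exist in the shapes but have no statistics
unmatched = checked.loc[checked["_merge"] == "left_only", "KantonNr"].tolist()
print(unmatched)
```

An inner join, as used above, would silently drop such unmatched shapes, so a check of this kind can reveal gaps before the map is drawn.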

5) Create Choropleth Geomaps for Visual Data Analysis

# Load PLOTLY library
import plotly.express as px
geodf = joined_geodf

A) Choropleth Map for Citizenship Ratio in Swiss Cantons

# Create Choropleth GeoMap with Population Data (Feature "zAnteil_Schweizer")
fig = px.choropleth_mapbox(
    geodf,
    geojson=geodf.geometry,
    locations=geodf.index,
    color='zAnteil_Schweizer',                                   # define feature variable
    color_continuous_scale=px.colors.diverging.Geyser,           # define color palette
    labels={'zAnteil_Schweizer':'Anteil Schweizer an der Gesamtbevölkerung 2022'},

    hover_name='KantonNr',                                       # define mouse over infos
    hover_data={'KantonNr':True, 'KantonName':True, 'Einwohner_2022':True, 'zAnteil_Schweizer':True},
    opacity=0.5,
    center=dict(lat=46.94809, lon=7.44744),                      # set capital Bern as map center
    zoom=6.5,
    mapbox_style="carto-positron"                                # other option "open-street-map"
)

fig.update_layout(margin=dict(l=0, r=0, t=0, b=0))

fig.show()

B) Choropleth Map for Senior-Junior Ratio (Age 65 and older vs. Age 19 and younger) in Swiss Cantons

# Create Choropleth GeoMap with Population Data (Feature "zRatio_SeniorJunior")
fig = px.choropleth_mapbox(
    geodf,
    geojson=geodf.geometry,
    locations=geodf.index,
    color='zRatio_SeniorJunior',                                     # define feature variable
    color_continuous_scale=px.colors.diverging.balance,                 # define color palette
    labels={'zRatio_SeniorJunior':'Ratio Alter 65+ vs. 20- in der Gesamtbevölkerung 2022'},

    hover_name='KantonNr',
    hover_data={'KantonNr':True, 'KantonName':True, 'Einwohner_2022':True, 'zRatio_SeniorJunior':True},
    opacity=0.5,
    center=dict(lat=46.94809, lon=7.44744),                      # set capital Bern as map center
    zoom=6.5,
    mapbox_style="carto-positron"                                # other options "open-street-map"
)

fig.update_layout(margin=dict(l=0, r=0, t=0, b=0))

fig.show()

C) Choropleth Map for Gender Ratio in Swiss Cantons

# Create Choropleth GeoMap with Population Data (Feature "zRatio_MannFrau")
fig = px.choropleth_mapbox(
    geodf,
    geojson=geodf.geometry,
    locations=geodf.index,
    color='zRatio_MannFrau',                                     # define feature variable
    color_continuous_scale=px.colors.diverging.RdBu,             # define color palette
    labels={'zRatio_MannFrau':'Ratio Mann vs. Frau in der Gesamtbevölkerung 2022'},

    hover_name='KantonNr',
    hover_data={'KantonNr':True, 'KantonName':True, 'Einwohner_2022':True, 'zRatio_MannFrau':True},
    opacity=0.5,
    center=dict(lat=46.94809, lon=7.44744),                      # set capital Bern as map center
    zoom=6.5,
    mapbox_style="carto-positron"                                # other options "open-street-map"
)

fig.update_layout(margin=dict(l=0, r=0, t=0, b=0))

fig.show()

D) Choropleth Map for Divorced Ratio in Swiss Cantons

# Create Choropleth GeoMap with Population Data (Feature "zAnteil_Geschieden")
fig = px.choropleth_mapbox(
    geodf,
    geojson=geodf.geometry,
    locations=geodf.index,
    color='zAnteil_Geschieden',                                  # define feature variable
    color_continuous_scale=px.colors.diverging.PuOr,             # define color palette
    labels={'zAnteil_Geschieden':'Anteil geschiedene Personen in der Gesamtbevölkerung 2022'},
    hover_name='KantonNr',
    hover_data={'KantonNr':True, 'KantonName':True, 'Einwohner_2022':True, 'zAnteil_Geschieden':True},
    opacity=0.5,
    center=dict(lat=46.94809, lon=7.44744),                      # set capital Bern as map center
    zoom=6.5,
    mapbox_style="carto-positron"                                # other options "open-street-map"
)

fig.update_layout(margin=dict(l=0, r=0, t=0, b=0))

fig.show()

E) Choropleth Map for Urban Population Ratio

# Create Choropleth GeoMap with Population Data (Feature "zRatio_Städtisch")
fig = px.choropleth_mapbox(
    geodf,
    geojson=geodf.geometry,
    locations=geodf.index,
    color='zRatio_Städtisch',                                     # define feature variable
    color_continuous_scale=px.colors.diverging.Geyser,              # define color palette
    labels={'zRatio_Städtisch':'Ratio städtische Population an der Gesamtbevölkerung 2022'},
    hover_name='KantonNr',
    hover_data={'KantonNr':True, 'KantonName':True, 'Einwohner_2022':True, 'zRatio_Städtisch':True},
    opacity=0.5,
    center=dict(lat=46.94809, lon=7.44744),                      # set capital Bern as map center
    zoom=6.5,
    mapbox_style="open-street-map"                                # other options "carto-positron"
)

fig.update_layout(margin=dict(l=0, r=0, t=0, b=0))

fig.show()
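
The five maps differ only in the feature column, the label, the palette and the base map style. One possible way to reduce the repetition is to collect the shared keyword arguments in a small function. This is a sketch under stated assumptions: the helper name `choropleth_kwargs` is hypothetical, and the palette is passed as a name string here, which Plotly Express accepts as an alternative to a colour list.

```python
def choropleth_kwargs(column, label, palette, map_style="carto-positron"):
    """Collect the keyword arguments shared by the five maps above."""
    return dict(
        color=column,                            # feature variable
        color_continuous_scale=palette,          # color palette
        labels={column: label},
        hover_name="KantonNr",
        hover_data={"KantonNr": True, "KantonName": True,
                    "Einwohner_2022": True, column: True},
        opacity=0.5,
        center=dict(lat=46.94809, lon=7.44744),  # capital Bern as map center
        zoom=6.5,
        mapbox_style=map_style,
    )


kwargs = choropleth_kwargs(
    "zRatio_MannFrau",
    "Ratio Mann vs. Frau in der Gesamtbevölkerung 2022",
    "RdBu",
)
print(sorted(kwargs))
```

A map could then be created with `px.choropleth_mapbox(geodf, geojson=geodf.geometry, locations=geodf.index, **kwargs)`, followed by the same `update_layout` and `show` calls as above.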